Zixin Zhang

HKUST

Boosting Multimodal Federated Learning via Chained Modality Optimization

Jun 01, 2026
Via arXiv

Panoramic Affordance Prediction

Mar 16, 2026
Via arXiv

DVD: Deterministic Video Depth Estimation with Generative Priors

Mar 12, 2026
Via arXiv

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026
Via arXiv

Show, Don't Tell: Morphing Latent Reasoning into Image Generation

Feb 02, 2026
Via arXiv

Step-DeepResearch Technical Report

Dec 24, 2025
Via arXiv

A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning

Dec 16, 2025
Via arXiv

TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

Nov 17, 2025
Via arXiv

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Oct 29, 2025
Via arXiv

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

Oct 10, 2025
Via arXiv